A Binary Factor Graph Model for Biclustering

Authors

  • Matteo Denitto
  • Alessandro Farinelli
  • Giuditta Franco
  • Manuele Bicego
Abstract

Biclustering, which can be defined as the simultaneous clustering of rows and columns in a data matrix, has received increasing attention in recent years, particularly in the field of Bioinformatics (e.g. for the analysis of microarray data). This paper proposes a novel biclustering approach, which extends the Affinity Propagation [1] clustering algorithm to the biclustering case. In particular, we propose a new exemplar-based model, encoded as a binary factor graph, which allows rows and columns to be clustered simultaneously. Moreover, we propose a linear formulation of this model so that the optimization problem can be solved using Linear Programming techniques. The proposed approach has been tested on a well-known synthetic microarray benchmark, with encouraging results.
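To make the definition of a bicluster concrete, the following is a minimal Python sketch. It is not the model proposed in the paper: it only shows how a bicluster selects a subset of rows and a subset of columns of a data matrix, and it scores the selection with the mean squared residue, a coherence measure commonly used in the biclustering literature. The function names and the toy matrix are assumptions made for illustration.

```python
def bicluster_submatrix(matrix, rows, cols):
    """Return the submatrix selected by the given row and column indices."""
    return [[matrix[i][j] for j in cols] for i in rows]


def mean_squared_residue(sub):
    """Mean squared residue of a submatrix (0 means a perfectly additive pattern)."""
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Deviation of each entry from its row mean, column mean and overall mean.
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: rows 0 and 2 follow an additive pattern on columns 1 and 3.
data = [
    [9.0, 1.0, 4.0, 2.0],
    [3.0, 7.0, 0.0, 8.0],
    [5.0, 3.0, 6.0, 4.0],
]

good = bicluster_submatrix(data, rows=[0, 2], cols=[1, 3])
whole = bicluster_submatrix(data, rows=[0, 1, 2], cols=[0, 1, 2, 3])

print(mean_squared_residue(good))   # coherent bicluster
print(mean_squared_residue(whole))  # full matrix, less coherent
```

In this sketch the selected rows and columns form a coherent block while the full matrix does not, which is the kind of structure a biclustering method searches for.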


Similar resources

Biclustering Gene Expressions Using Factor Graphs and the Max-Sum Algorithm

Biclustering is an intrinsically challenging and highly complex problem, particularly studied in the biology field, where the goal is to simultaneously cluster genes and samples of an expression data matrix. In this paper we present a novel approach to gene expression biclustering by providing a binary Factor Graph formulation of this problem. In more detail, we reformulate biclustering as a se...


A biclustering approach based on factor graphs and the max-sum algorithm

Biclustering represents an intrinsically complex problem, where the aim is to perform a simultaneous row- and column-clustering of a given data matrix. Some recent approaches model this problem using factor graphs, so as to exploit their ability to open the door to efficient optimization approaches for well designed function decompositions. However, while such models provide promising results, they ...


Mistake Bounds for Binary Matrix Completion

We study the problem of completing a binary matrix in an online learning setting. On each trial we predict a matrix entry and then receive the true entry. We propose a Matrix Exponentiated Gradient algorithm [1] to solve this problem. We provide a mistake bound for the algorithm, which scales with the margin complexity [2, 3] of the underlying matrix. The bound suggests an interpretation where ...


Biclustering Sparse Binary Genomic Data

Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression bi...


cHawk: An Efficient Biclustering Algorithm based on Bipartite Graph Crossing Minimization

Biclustering is a very useful data mining technique for gene expression analysis and profiling. It helps identify patterns where different genes are co-related based on a subset of conditions. Bipartite Spectral partitioning is a powerful technique to achieve biclustering but its computational complexity is prohibitive for applications dealing with large input data. We provide a connection betwee...



Publication date: 2014